Summarizing Encyclopedic Term Descriptions on the Web

Authors

  • Atsushi Fujii
  • Tetsuya Ishikawa
Abstract

We are developing an automatic method to compile an encyclopedic corpus from the Web. In our previous work, paragraph-style descriptions for a term are extracted from Web pages and organized based on domains. However, these descriptions are independent and do not comprise a condensed text as in hand-crafted encyclopedias. To resolve this problem, we propose a summarization method, which produces a single text from multiple descriptions. The resultant summary concisely describes a term from different viewpoints. We also show the effectiveness of our method by means of experiments.

Related resources

Organizing Encyclopedic Knowledge based on the Web and its Application to Question Answering

We propose a method to generate large-scale encyclopedic knowledge, which is valuable for much NLP research, based on the Web. We first search the Web for pages containing a term in question. Then we use linguistic patterns and HTML structures to extract text fragments describing the term. Finally, we organize extracted term descriptions based on word senses and domains. In addition, we apply a...

Producing a Large-scale Encyclopedic Corpus over the Web

Encyclopedias, which describe general/technical terms, are valuable language resources (LRs). As with other types of LRs relying on human introspection and supervision, constructing encyclopedias is quite expensive. To resolve this problem, we automatically produced a large-scale encyclopedic corpus over the World Wide Web. We first searched the Web for pages containing a term in question. Then...

Image Retrieval and Disambiguation for Encyclopedic Web Search

To produce multimedia encyclopedic content, we propose a method to search the Web for images associated with a specific word sense. We use text in an HTML file which links to an image as a pseudocaption for the image and perform text-based indexing and retrieval. We use term descriptions in a Web search site called “CYCLONE” as queries and correspond images and texts based on word senses.

Web Mining for Compiling and Accessing Encyclopedic Contents

Reflecting the growth and diversity of information on the Web, it has become common to consult search engines and portal sites for various topics. This paper describes a retrieval system called “Cyclone”, which is intended to enhance the utility of the Web as an encyclopedia. Cyclone compiles an encyclopedic content consisting of term descriptions and related terms automatically. The number of ...

Journal:
  • CoRR

Volume cs.CL/0407026

Publication date 2004